Everything is executed on a single node; reported values are the average of 10 runs.
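As a rough illustration of how the `average` and `stdev` columns in the tables could be produced from repeated runs, here is a minimal sketch. It assumes the sample standard deviation is used and that an `NA` entry corresponds to a configuration with fewer than two runs; neither is stated in this report. The function name `summarize` and the input values are hypothetical.

```python
import statistics


def summarize(runs):
    """Return (average, stdev) for a list of per-run metric values.

    stdev is the sample standard deviation (an assumption); it is None
    when fewer than two runs are available, since it is undefined then.
    """
    average = statistics.mean(runs)
    stdev = statistics.stdev(runs) if len(runs) >= 2 else None
    return average, stdev


# Hypothetical values for illustration only; not measurements from this report.
example_runs = [100.0, 102.0, 98.0, 101.0, 99.0]
avg, sd = summarize(example_runs)
print(f"average={avg:.2f} stdev={sd:.2f}")

# A single run has no defined sample standard deviation.
print(summarize([300.0]))
```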

Metrics

Some of the reported HPCC metrics:

Results

Overall Performance (that is, across all MPI processes)

High Performance LINPACK Floating-Point Performance, MFLOP per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 798412.2 5841.94
Ookami; cray,libsci,fftw-cray,mvapich 96 1170094.0 8101.97
Ookami; cray,libsci,fftw-cray,mvapich 192 1636268.0 38642.51
Ookami; cray,libsci,fftw-cray,mvapich 384 2730327.8 122638.49
Ookami; cray,libsci,fftw-cray,mvapich 768 3960172.2 260851.89
Ookami; gcc,armpl,mvapich 48 969092.4 10208.34
Ookami; gcc,armpl,mvapich 96 1310979.1 76321.64
Ookami; gcc,armpl,mvapich 192 1701196.4 62641.53
Ookami; gcc,openblas,fftw-rd,mvapich 48 121724.3 826.64
Ookami; gcc,openblas,fftw-rd,mvapich 96 231342.1 3580.44
Ookami; gcc,openblas,fftw-rd,mvapich 192 416138.4 3251.73
Stampede2-SKX; icc,mkl,intel-mpi 48 1136890.9 32875.98
Stampede2-SKX; icc,mkl,intel-mpi 96 2245493.6 38268.88
Stampede2-SKX; icc,mkl,intel-mpi 192 4214315.0 80665.64
Stampede2-SKX; icc,mkl,intel-mpi 384 8479441.0 102739.62
Stampede2-SKX; icc,mkl,intel-mpi 768 15933200.0 NA
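Each table row follows the layout `config cores average stdev`, where the config label itself contains a space after the semicolon. The sketch below shows one way such rows might be parsed for further analysis. The function name `parse_row` and the dictionary keys are hypothetical, and splitting the label into a system name and a comma-separated toolchain is an assumption about how the label is built.

```python
def parse_row(line):
    """Parse one table row into a dict of its fields."""
    # The last three whitespace-separated fields are cores, average, stdev;
    # everything before them is the config label, which contains a space.
    config, cores, average, stdev = line.rsplit(None, 3)
    # Assumed label layout: "<system>; <comma-separated toolchain>".
    system, toolchain = (part.strip() for part in config.split(";", 1))
    return {
        "system": system,
        "toolchain": toolchain.split(","),
        "cores": int(cores),
        "average": float(average),
        # "NA" marks a missing standard deviation.
        "stdev": None if stdev == "NA" else float(stdev),
    }


# Two rows copied from the HPL table above.
rows = [
    "Ookami; cray,libsci,fftw-cray,mvapich 48 798412.2 5841.94",
    "Stampede2-SKX; icc,mkl,intel-mpi 768 15933200.0 NA",
]
for row in rows:
    print(parse_row(row))
```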

Fast Fourier Transform (FFTW) Floating-Point Performance, MFLOP per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 21841.60 444.98
Ookami; cray,libsci,fftw-cray,mvapich 96 26190.35 3764.05
Ookami; cray,libsci,fftw-cray,mvapich 192 33631.62 1905.40
Ookami; cray,libsci,fftw-cray,mvapich 384 41834.94 1996.65
Ookami; cray,libsci,fftw-cray,mvapich 768 73214.14 9931.32
Ookami; gcc,armpl,mvapich 48 297.69 7.64
Ookami; gcc,armpl,mvapich 96 254.19 7.51
Ookami; gcc,armpl,mvapich 192 198.87 18.58
Ookami; gcc,openblas,fftw-rd,mvapich 48 22525.90 1073.28
Ookami; gcc,openblas,fftw-rd,mvapich 96 34312.05 796.50
Ookami; gcc,openblas,fftw-rd,mvapich 192 34377.85 1404.21
Stampede2-SKX; icc,mkl,intel-mpi 48 41157.64 1360.57
Stampede2-SKX; icc,mkl,intel-mpi 96 57192.03 4177.13
Stampede2-SKX; icc,mkl,intel-mpi 192 63627.05 5168.34
Stampede2-SKX; icc,mkl,intel-mpi 384 110116.93 4665.70
Stampede2-SKX; icc,mkl,intel-mpi 768 198047.00 NA

Parallel Matrix Transpose (PTRANS), MByte per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 21924.53 300.81
Ookami; cray,libsci,fftw-cray,mvapich 96 16241.23 698.59
Ookami; cray,libsci,fftw-cray,mvapich 192 31591.13 1156.06
Ookami; cray,libsci,fftw-cray,mvapich 384 27979.91 951.00
Ookami; cray,libsci,fftw-cray,mvapich 768 40842.54 2958.71
Ookami; gcc,armpl,mvapich 48 22332.35 1527.83
Ookami; gcc,armpl,mvapich 96 23394.42 1652.13
Ookami; gcc,armpl,mvapich 192 29065.72 4013.39
Ookami; gcc,openblas,fftw-rd,mvapich 48 20993.82 4074.78
Ookami; gcc,openblas,fftw-rd,mvapich 96 25365.94 926.88
Ookami; gcc,openblas,fftw-rd,mvapich 192 27290.27 3658.27
Stampede2-SKX; icc,mkl,intel-mpi 48 14774.64 258.69
Stampede2-SKX; icc,mkl,intel-mpi 96 26250.93 1939.42
Stampede2-SKX; icc,mkl,intel-mpi 192 41518.02 3998.56
Stampede2-SKX; icc,mkl,intel-mpi 384 60789.69 1352.06
Stampede2-SKX; icc,mkl,intel-mpi 768 112281.60 NA

MPI Random Access, MUpdate per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 14.14 0.07
Ookami; cray,libsci,fftw-cray,mvapich 96 4.17 0.07
Ookami; cray,libsci,fftw-cray,mvapich 192 2.95 0.13
Ookami; cray,libsci,fftw-cray,mvapich 384 3.62 0.25
Ookami; cray,libsci,fftw-cray,mvapich 768 5.51 0.50
Ookami; gcc,armpl,mvapich 48 14.18 0.06
Ookami; gcc,armpl,mvapich 96 4.23 0.07
Ookami; gcc,armpl,mvapich 192 3.09 0.13
Ookami; gcc,openblas,fftw-rd,mvapich 48 14.51 0.09
Ookami; gcc,openblas,fftw-rd,mvapich 96 4.15 0.08
Ookami; gcc,openblas,fftw-rd,mvapich 192 3.01 0.09
Stampede2-SKX; icc,mkl,intel-mpi 48 39.34 0.38
Stampede2-SKX; icc,mkl,intel-mpi 96 72.73 0.89
Stampede2-SKX; icc,mkl,intel-mpi 192 129.63 1.12
Stampede2-SKX; icc,mkl,intel-mpi 384 197.66 5.15
Stampede2-SKX; icc,mkl,intel-mpi 768 300.36 NA

Average Double-Precision General Matrix Multiplication (DGEMM) Floating-Point Performance, MFLOP per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 957623.0 215.99
Ookami; cray,libsci,fftw-cray,mvapich 96 1915288.3 448.16
Ookami; cray,libsci,fftw-cray,mvapich 192 3829407.4 2376.46
Ookami; cray,libsci,fftw-cray,mvapich 384 7662080.0 1103.96
Ookami; cray,libsci,fftw-cray,mvapich 768 15328520.5 15656.11
Ookami; gcc,armpl,mvapich 48 1584897.2 39365.73
Ookami; gcc,armpl,mvapich 96 3147560.7 87734.41
Ookami; gcc,armpl,mvapich 192 6469176.4 153449.66
Ookami; gcc,openblas,fftw-rd,mvapich 48 141221.6 831.20
Ookami; gcc,openblas,fftw-rd,mvapich 96 284081.2 91.22
Ookami; gcc,openblas,fftw-rd,mvapich 192 567925.2 329.78
Stampede2-SKX; icc,mkl,intel-mpi 48 2143204.8 57346.26
Stampede2-SKX; icc,mkl,intel-mpi 96 4117100.5 114732.72
Stampede2-SKX; icc,mkl,intel-mpi 192 8202844.8 122461.53
Stampede2-SKX; icc,mkl,intel-mpi 384 16244382.7 173185.84
Stampede2-SKX; icc,mkl,intel-mpi 768 32680857.6 NA

Average STREAM ‘Triad’ Memory Bandwidth, MByte per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 645333.8 1154.26
Ookami; cray,libsci,fftw-cray,mvapich 96 1282324.6 8296.91
Ookami; cray,libsci,fftw-cray,mvapich 192 2518920.1 9364.26
Ookami; cray,libsci,fftw-cray,mvapich 384 4922177.4 42885.82
Ookami; cray,libsci,fftw-cray,mvapich 768 10174997.0 1349951.52
Ookami; gcc,armpl,mvapich 48 619270.1 913.57
Ookami; gcc,armpl,mvapich 96 1324185.3 308767.30
Ookami; gcc,armpl,mvapich 192 2447830.4 8878.91
Ookami; gcc,openblas,fftw-rd,mvapich 48 769305.0 475503.49
Ookami; gcc,openblas,fftw-rd,mvapich 96 1231082.6 2429.17
Ookami; gcc,openblas,fftw-rd,mvapich 192 2446059.1 14620.47
Stampede2-SKX; icc,mkl,intel-mpi 48 149561.6 2139.03
Stampede2-SKX; icc,mkl,intel-mpi 96 295010.8 3534.32
Stampede2-SKX; icc,mkl,intel-mpi 192 588145.2 8726.60
Stampede2-SKX; icc,mkl,intel-mpi 384 1165618.1 14617.46
Stampede2-SKX; icc,mkl,intel-mpi 768 2306896.0 NA

Average STREAM ‘Add’ Memory Bandwidth, MByte per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 645742.3 839.20
Ookami; cray,libsci,fftw-cray,mvapich 96 1277168.5 10748.98
Ookami; cray,libsci,fftw-cray,mvapich 192 2520599.1 15041.66
Ookami; cray,libsci,fftw-cray,mvapich 384 4939033.3 50882.35
Ookami; cray,libsci,fftw-cray,mvapich 768 9415429.5 1581511.54
Ookami; gcc,armpl,mvapich 48 617645.8 548.59
Ookami; gcc,armpl,mvapich 96 1331239.0 317111.21
Ookami; gcc,armpl,mvapich 192 2464045.2 62964.48
Ookami; gcc,openblas,fftw-rd,mvapich 48 767285.3 473116.35
Ookami; gcc,openblas,fftw-rd,mvapich 96 1231137.7 7630.67
Ookami; gcc,openblas,fftw-rd,mvapich 192 2441668.9 11300.10
Stampede2-SKX; icc,mkl,intel-mpi 48 148317.4 2277.77
Stampede2-SKX; icc,mkl,intel-mpi 96 293660.2 3848.26
Stampede2-SKX; icc,mkl,intel-mpi 192 584512.4 8077.76
Stampede2-SKX; icc,mkl,intel-mpi 384 1163158.5 12300.48
Stampede2-SKX; icc,mkl,intel-mpi 768 2289453.0 NA

Average STREAM ‘Copy’ Memory Bandwidth, MByte per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 572363.2 1155.43
Ookami; cray,libsci,fftw-cray,mvapich 96 1124767.8 16735.73
Ookami; cray,libsci,fftw-cray,mvapich 192 2210858.9 12286.80
Ookami; cray,libsci,fftw-cray,mvapich 384 4437298.4 256054.58
Ookami; cray,libsci,fftw-cray,mvapich 768 8747588.0 1037471.48
Ookami; gcc,armpl,mvapich 48 559510.6 3422.93
Ookami; gcc,armpl,mvapich 96 1120398.4 28846.14
Ookami; gcc,armpl,mvapich 192 2198374.1 11363.45
Ookami; gcc,openblas,fftw-rd,mvapich 48 587084.8 74317.58
Ookami; gcc,openblas,fftw-rd,mvapich 96 1107649.2 5941.62
Ookami; gcc,openblas,fftw-rd,mvapich 192 2199152.9 8676.49
Stampede2-SKX; icc,mkl,intel-mpi 48 131604.4 2595.41
Stampede2-SKX; icc,mkl,intel-mpi 96 260143.3 4468.70
Stampede2-SKX; icc,mkl,intel-mpi 192 515780.0 7833.55
Stampede2-SKX; icc,mkl,intel-mpi 384 1026142.8 11637.72
Stampede2-SKX; icc,mkl,intel-mpi 768 2008185.6 NA

Average STREAM ‘Scale’ Memory Bandwidth, MByte per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 575833.4 673.47
Ookami; cray,libsci,fftw-cray,mvapich 96 1140524.0 10603.19
Ookami; cray,libsci,fftw-cray,mvapich 192 2246822.5 19506.02
Ookami; cray,libsci,fftw-cray,mvapich 384 4348532.0 44578.50
Ookami; cray,libsci,fftw-cray,mvapich 768 8761473.8 1559181.77
Ookami; gcc,armpl,mvapich 48 562939.2 399.65
Ookami; gcc,armpl,mvapich 96 1233086.9 380560.87
Ookami; gcc,armpl,mvapich 192 2220444.3 17681.64
Ookami; gcc,openblas,fftw-rd,mvapich 48 691990.4 407376.64
Ookami; gcc,openblas,fftw-rd,mvapich 96 1117083.4 3450.25
Ookami; gcc,openblas,fftw-rd,mvapich 192 2218277.0 18324.49
Stampede2-SKX; icc,mkl,intel-mpi 48 130680.8 1906.40
Stampede2-SKX; icc,mkl,intel-mpi 96 257326.7 3402.04
Stampede2-SKX; icc,mkl,intel-mpi 192 512876.3 9295.28
Stampede2-SKX; icc,mkl,intel-mpi 384 1017383.1 10706.65
Stampede2-SKX; icc,mkl,intel-mpi 768 2024968.0 NA

Performance per Core (that is, per MPI process)

High Performance LINPACK Floating-Point Performance, MFLOP per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 16633.59 121.71
Ookami; cray,libsci,fftw-cray,mvapich 96 12188.48 84.40
Ookami; cray,libsci,fftw-cray,mvapich 192 8522.23 201.26
Ookami; cray,libsci,fftw-cray,mvapich 384 7110.23 319.37
Ookami; cray,libsci,fftw-cray,mvapich 768 5156.47 339.65
Ookami; gcc,armpl,mvapich 48 20189.43 212.67
Ookami; gcc,armpl,mvapich 96 13656.03 795.02
Ookami; gcc,armpl,mvapich 192 8860.40 326.26
Ookami; gcc,openblas,fftw-rd,mvapich 48 2535.92 17.22
Ookami; gcc,openblas,fftw-rd,mvapich 96 2409.81 37.30
Ookami; gcc,openblas,fftw-rd,mvapich 192 2167.39 16.94
Stampede2-SKX; icc,mkl,intel-mpi 48 23685.23 684.92
Stampede2-SKX; icc,mkl,intel-mpi 96 23390.56 398.63
Stampede2-SKX; icc,mkl,intel-mpi 192 21949.56 420.13
Stampede2-SKX; icc,mkl,intel-mpi 384 22081.88 267.55
Stampede2-SKX; icc,mkl,intel-mpi 768 20746.35 NA
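The per-core figures appear to be the overall figures divided by the number of cores (MPI processes). This relationship is inferred from the numbers rather than stated in the report. A minimal sketch checking it against the Ookami Cray HPL rows, with the hypothetical helper `per_core`:

```python
def per_core(overall, cores):
    """Per-core value, assuming it is simply overall / cores."""
    return overall / cores


# (cores, overall MFLOP/s, reported per-core MFLOP/s) for
# "Ookami; cray,libsci,fftw-cray,mvapich", copied from the HPL tables
# in the Overall Performance and Performance per Core sections.
rows = [
    (48, 798412.2, 16633.59),
    (96, 1170094.0, 12188.48),
    (192, 1636268.0, 8522.23),
    (384, 2730327.8, 7110.23),
    (768, 3960172.2, 5156.47),
]

for cores, overall, reported in rows:
    derived = per_core(overall, cores)
    # Allow for rounding of the published values.
    matches = abs(derived - reported) < 0.01
    print(cores, round(derived, 2), reported, matches)
```

The same division can be applied to the other metrics to cross-check a per-core table against its overall counterpart.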

Fast Fourier Transform (FFTW) Floating-Point Performance, MFLOP per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 455.03 9.27
Ookami; cray,libsci,fftw-cray,mvapich 96 272.82 39.21
Ookami; cray,libsci,fftw-cray,mvapich 192 175.16 9.92
Ookami; cray,libsci,fftw-cray,mvapich 384 108.95 5.20
Ookami; cray,libsci,fftw-cray,mvapich 768 95.33 12.93
Ookami; gcc,armpl,mvapich 48 6.20 0.16
Ookami; gcc,armpl,mvapich 96 2.65 0.08
Ookami; gcc,armpl,mvapich 192 1.04 0.10
Ookami; gcc,openblas,fftw-rd,mvapich 48 469.29 22.36
Ookami; gcc,openblas,fftw-rd,mvapich 96 357.42 8.30
Ookami; gcc,openblas,fftw-rd,mvapich 192 179.05 7.31
Stampede2-SKX; icc,mkl,intel-mpi 48 857.45 28.35
Stampede2-SKX; icc,mkl,intel-mpi 96 595.75 43.51
Stampede2-SKX; icc,mkl,intel-mpi 192 331.39 26.92
Stampede2-SKX; icc,mkl,intel-mpi 384 286.76 12.15
Stampede2-SKX; icc,mkl,intel-mpi 768 257.87 NA

Parallel Matrix Transpose (PTRANS), MByte per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 456.76 6.27
Ookami; cray,libsci,fftw-cray,mvapich 96 169.18 7.28
Ookami; cray,libsci,fftw-cray,mvapich 192 164.54 6.02
Ookami; cray,libsci,fftw-cray,mvapich 384 72.86 2.48
Ookami; cray,libsci,fftw-cray,mvapich 768 53.18 3.85
Ookami; gcc,armpl,mvapich 48 465.26 31.83
Ookami; gcc,armpl,mvapich 96 243.69 17.21
Ookami; gcc,armpl,mvapich 192 151.38 20.90
Ookami; gcc,openblas,fftw-rd,mvapich 48 437.37 84.89
Ookami; gcc,openblas,fftw-rd,mvapich 96 264.23 9.66
Ookami; gcc,openblas,fftw-rd,mvapich 192 142.14 19.05
Stampede2-SKX; icc,mkl,intel-mpi 48 307.81 5.39
Stampede2-SKX; icc,mkl,intel-mpi 96 273.45 20.20
Stampede2-SKX; icc,mkl,intel-mpi 192 216.24 20.83
Stampede2-SKX; icc,mkl,intel-mpi 384 158.31 3.52
Stampede2-SKX; icc,mkl,intel-mpi 768 146.20 NA

MPI Random Access, MUpdate per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 0.29 0.00
Ookami; cray,libsci,fftw-cray,mvapich 96 0.04 0.00
Ookami; cray,libsci,fftw-cray,mvapich 192 0.02 0.00
Ookami; cray,libsci,fftw-cray,mvapich 384 0.01 0.00
Ookami; cray,libsci,fftw-cray,mvapich 768 0.01 0.00
Ookami; gcc,armpl,mvapich 48 0.30 0.00
Ookami; gcc,armpl,mvapich 96 0.04 0.00
Ookami; gcc,armpl,mvapich 192 0.02 0.00
Ookami; gcc,openblas,fftw-rd,mvapich 48 0.30 0.00
Ookami; gcc,openblas,fftw-rd,mvapich 96 0.04 0.00
Ookami; gcc,openblas,fftw-rd,mvapich 192 0.02 0.00
Stampede2-SKX; icc,mkl,intel-mpi 48 0.82 0.01
Stampede2-SKX; icc,mkl,intel-mpi 96 0.76 0.01
Stampede2-SKX; icc,mkl,intel-mpi 192 0.68 0.01
Stampede2-SKX; icc,mkl,intel-mpi 384 0.51 0.01
Stampede2-SKX; icc,mkl,intel-mpi 768 0.39 NA

Average Double-Precision General Matrix Multiplication (DGEMM) Floating-Point Performance, MFLOP per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 19950.48 4.50
Ookami; cray,libsci,fftw-cray,mvapich 96 19950.92 4.67
Ookami; cray,libsci,fftw-cray,mvapich 192 19944.83 12.38
Ookami; cray,libsci,fftw-cray,mvapich 384 19953.33 2.87
Ookami; cray,libsci,fftw-cray,mvapich 768 19959.01 20.39
Ookami; gcc,armpl,mvapich 48 33018.69 820.12
Ookami; gcc,armpl,mvapich 96 32787.09 913.90
Ookami; gcc,armpl,mvapich 192 33693.63 799.22
Ookami; gcc,openblas,fftw-rd,mvapich 48 2942.12 17.32
Ookami; gcc,openblas,fftw-rd,mvapich 96 2959.18 0.95
Ookami; gcc,openblas,fftw-rd,mvapich 192 2957.94 1.72
Stampede2-SKX; icc,mkl,intel-mpi 48 44650.10 1194.71
Stampede2-SKX; icc,mkl,intel-mpi 96 42886.46 1195.13
Stampede2-SKX; icc,mkl,intel-mpi 192 42723.15 637.82
Stampede2-SKX; icc,mkl,intel-mpi 384 42303.08 451.00
Stampede2-SKX; icc,mkl,intel-mpi 768 42553.20 NA

Average STREAM ‘Triad’ Memory Bandwidth, MByte per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 13444.45 24.05
Ookami; cray,libsci,fftw-cray,mvapich 96 13357.55 86.43
Ookami; cray,libsci,fftw-cray,mvapich 192 13119.38 48.77
Ookami; cray,libsci,fftw-cray,mvapich 384 12818.17 111.68
Ookami; cray,libsci,fftw-cray,mvapich 768 13248.69 1757.75
Ookami; gcc,armpl,mvapich 48 12901.46 19.03
Ookami; gcc,armpl,mvapich 96 13793.60 3216.33
Ookami; gcc,armpl,mvapich 192 12749.12 46.24
Ookami; gcc,openblas,fftw-rd,mvapich 48 16027.19 9906.32
Ookami; gcc,openblas,fftw-rd,mvapich 96 12823.78 25.30
Ookami; gcc,openblas,fftw-rd,mvapich 192 12739.89 76.15
Stampede2-SKX; icc,mkl,intel-mpi 48 3115.87 44.56
Stampede2-SKX; icc,mkl,intel-mpi 96 3073.03 36.82
Stampede2-SKX; icc,mkl,intel-mpi 192 3063.26 45.45
Stampede2-SKX; icc,mkl,intel-mpi 384 3035.46 38.07
Stampede2-SKX; icc,mkl,intel-mpi 768 3003.77 NA

Average STREAM ‘Add’ Memory Bandwidth, MByte per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 13452.96 17.48
Ookami; cray,libsci,fftw-cray,mvapich 96 13303.84 111.97
Ookami; cray,libsci,fftw-cray,mvapich 192 13128.12 78.34
Ookami; cray,libsci,fftw-cray,mvapich 384 12862.07 132.51
Ookami; cray,libsci,fftw-cray,mvapich 768 12259.67 2059.26
Ookami; gcc,armpl,mvapich 48 12867.62 11.43
Ookami; gcc,armpl,mvapich 96 13867.07 3303.24
Ookami; gcc,armpl,mvapich 192 12833.57 327.94
Ookami; gcc,openblas,fftw-rd,mvapich 48 15985.11 9856.59
Ookami; gcc,openblas,fftw-rd,mvapich 96 12824.35 79.49
Ookami; gcc,openblas,fftw-rd,mvapich 192 12717.03 58.85
Stampede2-SKX; icc,mkl,intel-mpi 48 3089.95 47.45
Stampede2-SKX; icc,mkl,intel-mpi 96 3058.96 40.09
Stampede2-SKX; icc,mkl,intel-mpi 192 3044.34 42.07
Stampede2-SKX; icc,mkl,intel-mpi 384 3029.06 32.03
Stampede2-SKX; icc,mkl,intel-mpi 768 2981.06 NA

Average STREAM ‘Copy’ Memory Bandwidth, MByte per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 11924.23 24.07
Ookami; cray,libsci,fftw-cray,mvapich 96 11716.33 174.33
Ookami; cray,libsci,fftw-cray,mvapich 192 11514.89 63.99
Ookami; cray,libsci,fftw-cray,mvapich 384 11555.46 666.81
Ookami; cray,libsci,fftw-cray,mvapich 768 11390.09 1350.87
Ookami; gcc,armpl,mvapich 48 11656.47 71.31
Ookami; gcc,armpl,mvapich 96 11670.82 300.48
Ookami; gcc,armpl,mvapich 192 11449.87 59.18
Ookami; gcc,openblas,fftw-rd,mvapich 48 12230.93 1548.28
Ookami; gcc,openblas,fftw-rd,mvapich 96 11538.01 61.89
Ookami; gcc,openblas,fftw-rd,mvapich 192 11453.92 45.19
Stampede2-SKX; icc,mkl,intel-mpi 48 2741.76 54.07
Stampede2-SKX; icc,mkl,intel-mpi 96 2709.83 46.55
Stampede2-SKX; icc,mkl,intel-mpi 192 2686.35 40.80
Stampede2-SKX; icc,mkl,intel-mpi 384 2672.25 30.31
Stampede2-SKX; icc,mkl,intel-mpi 768 2614.82 NA

Average STREAM ‘Scale’ Memory Bandwidth, MByte per Second

config cores average stdev
Ookami; cray,libsci,fftw-cray,mvapich 48 11996.53 14.03
Ookami; cray,libsci,fftw-cray,mvapich 96 11880.46 110.45
Ookami; cray,libsci,fftw-cray,mvapich 192 11702.20 101.59
Ookami; cray,libsci,fftw-cray,mvapich 384 11324.30 116.09
Ookami; cray,libsci,fftw-cray,mvapich 768 11408.17 2030.18
Ookami; gcc,armpl,mvapich 48 11727.90 8.33
Ookami; gcc,armpl,mvapich 96 12844.66 3964.18
Ookami; gcc,armpl,mvapich 192 11564.81 92.09
Ookami; gcc,openblas,fftw-rd,mvapich 48 14416.47 8487.01
Ookami; gcc,openblas,fftw-rd,mvapich 96 11636.29 35.94
Ookami; gcc,openblas,fftw-rd,mvapich 192 11553.53 95.44
Stampede2-SKX; icc,mkl,intel-mpi 48 2722.52 39.72
Stampede2-SKX; icc,mkl,intel-mpi 96 2680.49 35.44
Stampede2-SKX; icc,mkl,intel-mpi 192 2671.23 48.41
Stampede2-SKX; icc,mkl,intel-mpi 384 2649.44 27.88
Stampede2-SKX; icc,mkl,intel-mpi 768 2636.68 NA